ABSTRACT

We use both an HI-selected and an optically-selected galaxy sample to directly
measure the abundance of galaxies as a function of their "baryonic" mass (stars
+ atomic gas). Stellar masses are calculated based on optical data from the
Sloan Digital Sky Survey (SDSS) and atomic gas masses are calculated using
atomic hydrogen (HI) emission line data from the Arecibo Legacy Fast ALFA
(ALFALFA) survey. By using the technique of abundance matching, we combine
the measured baryonic mass function (BMF) of galaxies with the dark matter halo
mass function in a ΛCDM universe, in order to determine the galactic baryon
fraction as a function of host halo mass. We find that the baryon fraction of
low-mass halos is much smaller than the cosmic value, even when atomic gas is
taken into account. We find that the galactic baryon deficit increases
monotonically with decreasing halo mass, in contrast with previous studies which
suggested an approximately constant baryon fraction at the low-mass end. We
argue that the observed baryon fractions of low-mass halos cannot be explained
by reionization heating alone, and that additional feedback mechanisms (e.g.
supernova blowout) must be invoked. However, the outflow rates needed to
reproduce our result are not easily accommodated in the standard picture of
galaxy formation in a ΛCDM universe.

1. Introduction 

It is by now well established that baryonic matter represents only about 1/6 of the total
matter density of the universe (e.g. Komatsu et al. 2011), while the majority is in the form
of non-baryonic dark matter (DM). Since galaxies form through the accretion of baryonic
material onto dynamically dominant DM structures (halos), it would be reasonable to assume
that the baryon mass fraction of present day galaxies approximately equals the cosmic value
($f_b = \Omega_b/\Omega_m \approx 0.16$). Despite this expectation, observations indicate that galaxies
are not able to retain their cosmic "fair share" of baryons, and that the resulting baryon
deficit depends strongly on the mass of their host halo.

1 Center for Radiophysics and Space Research, Space Sciences Building, Cornell University, Ithaca,
NY 14853, USA. e-mail: papastergis@astro.cornell.edu, shan@astro.cornell.edu, riccardo@astro.cornell.edu,
haynes@astro.cornell.edu

2 Laboratoire d'Astrophysique de Marseille, UMR 6110 CNRS, Univ. d'Aix-Marseille, 38 rue F. Joliot-Curie,
13388 Marseille cedex 13, France. e-mail: andrea.cattaneo@oamp.fr

The first line of evidence is provided by observational estimates of the abundance of
galaxies as a function of their total stellar mass, a distribution referred to as the galactic
stellar mass function (SMF). Thanks to the advent of wide area optical surveys with multiband
photometric and spectroscopic information, such as the Two degree Field Galaxy Redshift
Survey (2dFGRS) and the Sloan Digital Sky Survey (SDSS), the SMF has been measured
over a wide range of stellar mass, using statistical samples of tens of thousands of
galaxies and a variety of stellar mass estimation techniques (Cole et al. 2001; Bell et al. 2003;
Panter et al. 2007; Baldry et al. 2008; Li & White 2009; Yang et al. 2009; Baldry et al. 2012,
to name a few). The SMF displays an exponential cutoff at masses $M_* \gtrsim 10^{11} M_\odot$ and an
approximate power-law behavior at low masses ($dn \propto M_*^{\alpha}\, dM_*$), with a "shallow" exponent
of $\alpha \approx -1.3$. On the other hand, the halo mass function (HMF) predicted in the lambda
cold dark matter (ΛCDM) model follows a much "steeper" power law ($\alpha \approx -1.8$) over the
mass range of interest. This observation alone excludes the possibility that the stellar mass
of a galaxy is simply a fixed fraction of the host halo mass.
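The step behind this last statement can be written out explicitly. As a sketch, suppose (hypothetically) that stellar mass were a fixed fraction $\epsilon$ of halo mass and that the HMF were a pure power law with exponent $\alpha_h$ over the range of interest; the symbols $\epsilon$ and $\alpha_h$ are introduced here only for illustration:

```latex
M_* = \epsilon\, M_h , \qquad \frac{dn}{dM_h} \propto M_h^{\alpha_h}
\quad\Longrightarrow\quad
\frac{dn}{dM_*} = \frac{dn}{dM_h}\,\frac{dM_h}{dM_*}
               = \frac{1}{\epsilon}\,\frac{dn}{dM_h}\bigg|_{M_h = M_*/\epsilon}
               \propto M_*^{\alpha_h}
```

Under that assumption the SMF would inherit the HMF exponent, $\alpha_h \approx -1.8$, rather than the observed $\alpha \approx -1.3$, so the ratio $M_*/M_h$ has to vary with mass.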

One can furthermore statistically derive an average relation between the stellar mass of a
galaxy ($M_*$) and the mass of its host halo ($M_h$), through the technique of abundance
matching (see §5.1 for details). $M_* - M_h$ relations based on abundance matching (e.g.
Guo et al. 2010; Moster et al. 2010; Behroozi et al. 2010; Leauthaud et al. 2012) have shown that the
"stellar conversion efficiency", $\eta_* = (M_*/M_h)/f_b$, never exceeds 25 - 30%. Furthermore, $\eta_*$
peaks for Milky Way-sized galaxies ($M_h \sim 10^{12} M_\odot$), and declines rapidly on either side of
the peak (e.g. Figure 2 in Guo et al. 2010).
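The idea of abundance matching can be illustrated with a short numerical sketch. It is not the procedure used in this paper: both mass functions below are toy Schechter-like forms with hypothetical parameters, chosen only to mimic the slopes quoted above, and the function names are illustrative. Each galaxy mass is assigned the halo mass that has the same cumulative number density.

```python
import numpy as np

F_B = 0.16  # cosmic baryon fraction quoted in the text


def schechter_per_dex(log_m, phi_star, log_m_star, alpha):
    """Toy Schechter form per dex; alpha is the low-mass slope of dn/dM."""
    x = 10.0 ** (log_m - log_m_star)
    return np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)


def cumulative(log_m, dn_dlogm):
    """n(>M) at every grid point except the last, by trapezoid sums."""
    steps = 0.5 * (dn_dlogm[1:] + dn_dlogm[:-1]) * np.diff(log_m)
    return np.cumsum(steps[::-1])[::-1]


def abundance_match(n_gal, log_m_halo, n_halo):
    """Halo mass at which n_halo(>M_h) equals each galaxy abundance n_gal."""
    # np.interp needs increasing x, so reverse the decreasing halo abundances.
    return np.interp(np.log10(n_gal), np.log10(n_halo[::-1]), log_m_halo[::-1])


# Hypothetical parameters (assumptions, not measurements from this work).
log_ms = np.linspace(7.0, 12.0, 501)
log_mh = np.linspace(9.0, 15.5, 651)
n_gal = cumulative(log_ms, schechter_per_dex(log_ms, 5e-3, 10.7, -1.3))
n_halo = cumulative(log_mh, schechter_per_dex(log_mh, 1e-4, 14.0, -1.8))

log_mh_matched = abundance_match(n_gal, log_mh[:-1], n_halo)
# Stellar conversion efficiency, eta_* = (M_* / M_h) / f_b.
eta_star = 10.0 ** (log_ms[:-1] - log_mh_matched) / F_B
```

Because both cumulative functions decrease monotonically with mass, the matched halo mass never decreases with galaxy mass; the numerical values of `eta_star` depend entirely on the toy parameters and should not be read as results.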



Similar conclusions are reached when halo masses are measured directly, through weak lensing
or kinematics studies (e.g. Dutton et al. 2010; Reyes et al. 2012). For
example, Reyes et al. (2012) used stacked weak lensing measurements to estimate the
average host halo mass of galaxies in different stellar mass bins, and found that $\eta_*$ never exceeds
~30%. Direct halo mass measurements can circumvent a number of assumptions inherent
in the application of abundance matching, but such techniques can presently only be applied
to a restricted range of stellar mass ($M_* \sim 10^{9} - 10^{11} M_\odot$), and are affected by their own set
of systematics.



Stellar mass is not always the dominant baryonic component in a galaxy. In fact, the
HI-to-stellar mass ratio ("HI fraction"; $f_{HI} = M_{HI}/M_*$) tends to increase with decreasing stellar
mass, and HI often dominates the baryonic content of low-mass galaxies. The transition
from stellar-mass-dominated to HI-dominated systems takes place at $M_* \approx 10^{10} M_\odot$ for
HI-selected samples (e.g. Huang et al. 2012; see also Fig. 10 in this work), or at $M_* \lesssim 10^{9.5} M_\odot$
for optically-selected samples (e.g. Catinella et al. 2010). As a result, it is presently not
clear how the "baryon retention fraction" $\eta_b = (M_b/M_h)/f_b$ behaves in low-mass
galaxies when both stars and cold gas are taken into account. In particular, it is not well
understood whether the very low average value of $\eta_*$ inferred for low-mass halos is a result
of poor retention of baryonic material, of the low efficiency of gas-to-stars conversion, or of
a combination of both.



For example, Baldry et al. (2008) argue that the increasing gas fraction in low-mass
galaxies should approximately offset the decreasing stellar-to-halo mass ratio, and result
in a roughly constant $\eta_b \sim 10\%$. This conclusion was based on an indirect estimate of
the cold gas content of galaxies, derived from the average $f_{HI} - M_*$ relation observed in a
set of samples in the literature. An early work by Salucci & Persic (1999), based on the
same indirect method, also reached a qualitatively similar conclusion. Evoli et al. (2011)
found an approximately constant $\eta_b$ at the low-mass end using a different indirect method,
which involves the comparison of the stellar and HI mass distributions of two different
galaxy samples. These results would imply that low-mass galaxies are relatively efficient
at retaining baryonic mass, but very inefficient in converting their gas into stars. This
conclusion, however, would require a "steep" HI mass function (HIMF) in the local universe,
in contrast to what is measured (Zwaan et al. 2005; Martin et al. 2010). Moreover, the recent
work of Rodríguez-Puebla et al. (2011), also based on the average $f_{HI} - M_*$ relations
for blue and red galaxies separately, found no sign of a flat $\eta_b$ at low masses.

In this article we directly measure the abundance of galaxies as a function of their
"baryonic mass" (throughout this article the term baryonic refers to the combined stellar and
atomic gas components of galaxies, and baryonic mass is calculated as $M_b = M_* + 1.4\, M_{HI}$,
where the 1.4 factor accounts for the presence of helium). We use optical data from the
seventh data release of the SDSS (SDSS DR7) to estimate stellar masses, and HI-line flux
measurements from the Arecibo Legacy Fast ALFA$^1$ (ALFALFA) survey to measure atomic
gas masses. The resulting distribution, referred to hereafter as the baryon mass function
(BMF) of galaxies, can be used in abundance matching to derive a robust $\eta_b - M_h$ relation. In
order to investigate sample selection effects, we employ both an HI-selected and an
optically-selected sample drawn from the same volume to derive the mass distributions for the stellar,
atomic hydrogen and baryonic components.



1 The Arecibo L-band Feed Array (ALFA) is a 7-feed receiver operating in the L-band (~1420 MHz),
installed at the Arecibo Observatory.

The paper is organized as follows: in section 2 we introduce the datasets used to
measure the stellar, HI and baryon mass distributions. We describe the methodology used
to measure atomic hydrogen masses and we estimate stellar masses for our galaxy samples.
In section 3 we present our measurements of the SMF, HIMF & BMF from both the
HI-selected and the optically-selected samples, and compare them against one another as well as
against other published results. In section 4 we consider the impact of possible systematics
on our measurements, such as stellar mass estimation method, distance uncertainties and the
exclusion of some baryonic components (e.g. molecular gas) in the calculation of the BMF.
In section 5 we present the $\eta_* - M_h$ and $\eta_b - M_h$ relations in a ΛCDM universe. In section 6
we discuss the implications of the result and summarize our main conclusions. Throughout
this paper, we use a Hubble constant of $H_0 = 70\, h_{70}$ km s$^{-1}$ Mpc$^{-1}$.

2. Datasets & derived quantities
2.1. HI-selected sample

We select galaxies from the current data release of the ALFALFA survey, which covers
40% of the planned final survey area ("α.40" catalog; Haynes et al. 2011). We restrict
ourselves to two rectangular areas of the "spring" coverage of α.40 ($07^h45^m < \mathrm{RA} < 16^h30^m$,
$4° < \mathrm{Dec} < 16°$ & $24° < \mathrm{Dec} < 28°$), which encompass the Virgo cluster as well as the
supergalactic plane at low velocities. We restrict ourselves to galaxies with $v_{CMB} < 15000$
km s$^{-1}$ ($z < 0.05$), in order to avoid the strong radio frequency interference (RFI) present
at frequencies that correspond to $v_\odot > 15000$ km s$^{-1}$. We discard the nearest extragalactic
sources with $D < 10$ Mpc, because they can carry extreme fractional uncertainties on their
distances (see §2.3 for details on the distance assignment method). We furthermore select
only HI sources designated as "Code 1" in α.40, i.e. extragalactic sources detected at high
significance ($S/N_{HI} > 6.5$). In addition, we exclude sources with integrated fluxes below the
50% completeness limit of the ALFALFA survey (see Haynes et al. 2011, Section 6 for the
derivation of the ALFALFA completeness limits). The above requirements are satisfied by
7618 galaxies.

We remove from our sample 204 α.40 sources which are not crossmatched with an optical
source in SDSS, as well as 208 additional sources which have been flagged as having
problematic SDSS photometry (crossmatch code "P" in α.40; for details see Section 4 in
Haynes et al. 2011). This quality cut on the SDSS photometry introduces some bias against faint, low
surface brightness galaxies of irregular morphology; such sources are often "shredded" (i.e.
assigned multiple photometric objects) by the SDSS magnitude extraction process, and are
usually assigned a "P" ("photometry suspect") crossmatch code in α.40. Lastly, 11
additional objects were discarded, in cases where the stellar mass computation method described
in §2.3 failed to produce physically plausible results.

Our final sample thus consists of 7195 extragalactic objects, detected over ≈2000 deg$^2$ of
high Galactic latitude sky and out to $D \approx 214$ Mpc. The upper panel of Figure 1 displays the
spatial distribution of our HI-selected galaxies, and highlights the complex large scale
structure in the survey volume. Note that all objects in our HI-selected sample have 21cm
redshifts$^2$ and line fluxes as well as multi-band optical photometry, and hence estimates of
both their stellar and atomic hydrogen masses.
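The selection criteria of §2.1 amount to a handful of boolean cuts, which the following sketch makes explicit. The function names and the argument names (`v_cmb`, `hi_code`, `above_50pct_limit`) are hypothetical and do not correspond to actual α.40 column names; the completeness test is assumed to be precomputed.

```python
import numpy as np


def in_spring_area(ra_deg, dec_deg):
    """Sky-area cut: 07h45m < RA < 16h30m and 4<Dec<16 or 24<Dec<28 (deg)."""
    ra_deg = np.asarray(ra_deg, dtype=float)
    dec_deg = np.asarray(dec_deg, dtype=float)
    ra_ok = (ra_deg > 7.75 * 15.0) & (ra_deg < 16.5 * 15.0)  # hours -> degrees
    dec_ok = ((dec_deg > 4.0) & (dec_deg < 16.0)) | (
        (dec_deg > 24.0) & (dec_deg < 28.0)
    )
    return ra_ok & dec_ok


def select_hi_sample(v_cmb, dist_mpc, hi_code, above_50pct_limit):
    """Velocity, distance, detection-code and completeness cuts of Sec. 2.1."""
    return (
        (np.asarray(v_cmb, dtype=float) < 15000.0)      # km/s, avoids RFI
        & (np.asarray(dist_mpc, dtype=float) > 10.0)    # Mpc
        & (np.asarray(hi_code) == 1)                    # "Code 1" sources
        & np.asarray(above_50pct_limit, dtype=bool)     # assumed precomputed
    )


# Bookkeeping of the sample sizes quoted in the text.
n_after_cuts = 7618
n_final = n_after_cuts - 204 - 208 - 11  # no SDSS match, "P" code, failed fit
d_max_mpc = 15000.0 / 70.0               # Hubble distance at the velocity limit
```

The last two lines simply check the quoted numbers: the three removals leave 7195 objects, and the velocity limit corresponds to a Hubble distance of about 214 Mpc.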



2.2. Optically-selected sample 



We draw an optically-selected sample from the SDSS DR7 (Abazajian et al. 2009)
spectroscopic database, in the same volume used to define our HI-selected sample. Specifically, we
select galaxies that lie within the same sky area ($07^h45^m < \mathrm{RA} < 16^h30^m$, $4° < \mathrm{Dec} < 16°$ &
$24° < \mathrm{Dec} < 28°$) and satisfy the same velocity and distance restrictions ($v_{CMB} < 15000$ km
s$^{-1}$, $D > 10$ Mpc). CMB velocities for our optically-selected galaxies are calculated based
on their SDSS spectroscopic redshifts ($z_{SDSS}$). We restrict ourselves to objects
spectroscopically classified as galaxies in SDSS (specClass = 2) that also have an apparent Petrosian
magnitude brighter than 17.5 in the r-band ($r_{petro} < 17.5$). This initial cut results in 22707
galaxies. Due to their large number, it is not practical to inspect all galaxies individually for
the quality of their SDSS photometry/spectroscopy. As a result, we expect a fraction of our
sources to have issues with their SDSS photometry, most often related to "shredding" (i.e.
assignment of multiple photometric objects to a single galaxy). This issue affects mostly
extended sources with structure in their light distribution, such as low surface brightness
(LSB) galaxies with irregular morphology. In such cases, the SDSS magnitude will
underestimate the true flux of the galaxy, which in turn will result in an underestimate of its
stellar mass. A second issue related to shredding is that bright star forming knots in the
disks of nearby spiral galaxies can sometimes be cataloged as separate spectroscopic objects,
and hence interpreted as low-mass satellites of the main spiral. We find that applying a
color cut on our sample, $(i - z)_{model} > -0.25$, removes a fair fraction of these unwanted
cases. On the other hand, cuts based on the quality of the SDSS spectrum (such as cuts on
zconf, zstatus or zwarning) are ineffective, since they exclude mostly legitimate faint or
LSB dwarf galaxies with noisy spectra. Lastly, we exclude objects for which the stellar mass
computation described in §2.3 failed to produce physically plausible results.

2 Of the 7195 galaxies in the HI-selected sample, 1333 are not in the SDSS DR7 spectroscopic database
and thus lack SDSS optical redshifts.

Fig. 1. Spatial distribution of the 7195 HI-selected galaxies (upper panel) and 22587
optically-selected galaxies (lower panel), drawn from the same volume. The galaxy stellar
mass function (SMF), HI mass function (HIMF) and baryonic mass function (BMF) are
computed separately for the two samples, in order to assess the impact of sample selection
on the derived distributions.

Our final optically-selected sample consists of 22587 galaxies, occupying the same volume
as our HI-selected sample. We crossmatch the optical sample with the full α.40 catalog
(including Code 1 & 2 sources), and find 7551 HI source counterparts. The crossmatch rate
is thus approximately 1/3, as reported in Haynes et al. (2011). Note that galaxies not
detected by ALFALFA are not necessarily HI-poor objects; due to the low emissivity of atomic
hydrogen in the 21cm line, even moderately gas-rich galaxies in the outer portion of the
survey volume can be missed by ALFALFA. This point is illustrated by the lower panel of
Figure 1, which compares the spatial distribution of galaxies in the HI-selected and
optically-selected samples. Note that all optically-selected galaxies have multiband optical photometry
as well as optical redshifts, and hence an estimate of their stellar mass. However, only galaxies
crossmatched with an α.40 source have a 21cm flux measurement, and hence a value for their
atomic hydrogen mass.



2.3. Derived quantities 

We calculate HI masses from the measured 21cm integrated flux reported in α.40.
Assuming optically thin emission,

$$ M_{HI} = 2.356 \times 10^{5}\, S_{int}\, D^{2} \qquad (1) $$

where $M_{HI}$ is the HI mass in units of the solar mass ($M_\odot$), $S_{int}$ is the integrated flux in
Jy km s$^{-1}$ and $D$ is the distance in Mpc. Distances in this article are calculated according to
the method used in α.40 (Haynes et al. 2011): nearby galaxies ($v_{CMB} < 6000$ km s$^{-1}$) are
assigned distances through the use of a peculiar velocity flow model developed by Masters
(2005), while for more distant galaxies simple Hubble distances are used ($D = v_{CMB}/H_0$,
with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$). Moreover, group and cluster membership information is taken
into account when available, as well as primary distance measurements published in the
literature. Note that most of the galaxies in our optically-selected sample
are not included in α.40, and hence lack the systematic group assignments and primary
distance information contained in the catalog. Nevertheless, optically-selected galaxies that lie
within the sky area and redshift range of the Virgo cluster are placed collectively at the Virgo
distance ($D = 16.5$ Mpc), in order to minimize the effects of peculiar motions on the inferred
distances of galaxies in the region. The distance assignment
method can have a large impact on the determination of mass functions, especially at the
low-mass end. We illustrate this issue in §4.2, where we consider the effect on the HIMF of
using pure Hubble distances for all galaxies.

Fig. 2. Histograms of i-band absolute magnitude (panel a), $g - i$ color (panel b), stellar
mass (panel c) and HI mass (panel d), for the optically-selected (red solid line) and
HI-selected (blue dashed line) samples. As evident in panel b, the HI-selected sample is strongly
biased against the red galaxy population and as a result it is skewed towards lower luminosity
and stellar mass systems. Conversely, the fractional contribution of bright and massive
galaxies ($M_i < -19$, $M_* > 10^{9} M_\odot$) is larger for the optically-selected sample (panels a and
c).
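Equation (1) and the Hubble-distance rule for distant galaxies can be sketched in a few lines. The function names are illustrative, and the sketch deliberately leaves out the flow model, group assignments and primary distances used for nearby galaxies.

```python
H0 = 70.0  # km/s/Mpc, the value adopted in the text


def hubble_distance_mpc(v_cmb_kms, h0=H0):
    """Pure Hubble distance D = v_CMB / H0 (used in the text for v_CMB > 6000 km/s)."""
    return v_cmb_kms / h0


def hi_mass_msun(s_int_jy_kms, dist_mpc):
    """Eq. (1): optically thin HI mass in solar masses."""
    return 2.356e5 * s_int_jy_kms * dist_mpc ** 2


# Example: a source with S_int = 1 Jy km/s at v_CMB = 7000 km/s.
d_example = hubble_distance_mpc(7000.0)
m_example = hi_mass_msun(1.0, d_example)
```

Since the mass scales as $D^2$, a fractional distance error translates into twice that fractional error in $M_{HI}$, which is why the distance assignment matters most for the nearby, low-mass end.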

We compute stellar masses for our galaxies by fitting all 5 SDSS photometric
bands (u, g, r, i, z) with model spectral energy distributions (SEDs). The full details of the
method can be found in Huang et al. (2012), but here we summarize the main points: a
library of model SEDs is generated, using the Bruzual & Charlot (2003) stellar population
synthesis code and assuming a Chabrier (2003) stellar initial mass function (IMF). Models
with an extensive range of internal extinction, metallicity and star formation histories are
considered. In particular, star formation history templates include both an exponentially
declining component as well as random starburst episodes. The final physical properties (e.g.
stellar mass, star formation rate, internal extinction etc.) are computed as the average of
all model values, where each model is weighted according to its fit likelihood. In addition to
mean values, "1σ" uncertainties of the physical properties can also be derived, as one quarter
of the 2.5-97.5 percentile range of model values. The median 1σ uncertainty in $\log M_*$ is
0.086 dex, or about 22% (excluding uncertainties on the distance). It is important to note
that stellar mass estimates of the same galaxy obtained with different methods can have
systematic offsets of up to factors of a few. In §4.1 we address issues related to stellar mass
estimation, and consider alternative methods for calculating stellar masses (Bell et al. 2003;
Taylor et al. 2011).
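The likelihood-weighted average and the percentile-based "1σ" described above can be sketched as follows. This is a minimal illustration, not the fitting code of Huang et al. (2012): the function names are hypothetical, and treating the percentiles as likelihood-weighted is an assumption made here.

```python
import numpy as np


def weighted_percentile(values, weights, q):
    """Percentile q (0-100) of `values` under the given (unnormalised) weights."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cdf = np.cumsum(w) / np.sum(w)
    return np.interp(q / 100.0, cdf, v)


def summarize(values, likelihoods):
    """Likelihood-weighted mean and one quarter of the 2.5-97.5 percentile range."""
    mean = np.average(values, weights=likelihoods)
    lo = weighted_percentile(values, likelihoods, 2.5)
    hi = weighted_percentile(values, likelihoods, 97.5)
    return mean, (hi - lo) / 4.0


def dex_to_fraction(dex):
    """Fractional change corresponding to a logarithmic offset in dex."""
    return 10.0 ** dex - 1.0


# Sanity check on the quoted conversion: 0.086 dex is about 22%.
frac_median_error = dex_to_fraction(0.086)
```

For a Gaussian distribution the 2.5-97.5 percentile range spans about 3.92σ, so dividing by four gives a value close to the standard deviation, which motivates the "1σ" label.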



Figure 2 compares the distributions of i-band absolute magnitude ($M_i$) and $g - i$ color
(both corrected for Milky Way extinction), stellar mass ($M_*$) and HI mass ($M_{HI}$) for the
HI-selected and optically-selected samples. The most notable difference is in the $g - i$
color distribution, where the HI-selected sample shows a strong bias against the red galaxy
population (see also Huang et al. 2012). As a result, the optically-selected sample contains
a larger proportion of high luminosity and stellar mass systems compared to the HI-selected
sample. By contrast, the $M_{HI}$ distributions of the two samples are very similar, but note
that only those optically-selected galaxies that are detected in ALFALFA are included in the
histogram.
3. The baryonic mass function
3.1. Method 

Stellar masses for all galaxies, and HI masses for all ALFALFA-detected galaxies are
calculated as described in §2.3. For the ~15000 galaxies in the optically-selected sample
that lack an ALFALFA detection, we assign a lower and an upper limit on their atomic
hydrogen content ($M_{HI}^{min}$, $M_{HI}^{max}$). The lower limit is simply $M_{HI}^{min} = 0$, which corresponds
to an HI-devoid galaxy. The upper limit is calculated by assuming that the HI flux of the
non-detected galaxy lies just below the ALFALFA "detection limit", as defined by the 25%
completeness limit of the α.40 catalog when both Code 1 & 2 sources are considered. More
precisely,

$$ \log M_{HI}^{max} = 5.372 + \log S_{int}^{lim} + 2 \log D \qquad (2) $$

where $D$ is the galaxy distance in Mpc determined by its SDSS optical redshift, and $S_{int}^{lim}$
is the flux level at which the completeness of the α.40 catalog falls to 25%, in Jy km s$^{-1}$.
According to Eqns. 6 & 7 of Haynes et al. (2011),

$$ \log S_{int}^{lim} = \begin{cases} 0.5 \log W_{50} - 1.312, & \log W_{50} \le 2.5 \\ \log W_{50} - 2.562, & \log W_{50} > 2.5 \end{cases} \qquad (3) $$

where $W_{50}$ is the HI-line profile width in km s$^{-1}$, measured at the 50% flux level of the profile
peak. Since ALFALFA non-detected galaxies lack a measurement of $W_{50}$, we assign a value
based on the average $M_* - v_{rot}$ relation (i.e. the stellar mass Tully-Fisher relation) of α.40
galaxies. We then project the $v_{rot}$ value on the line of sight according to the SDSS r-band
axial ratio, assuming an intrinsic axial ratio of $q_0 = 0.13$ for all galaxies.

Baryonic masses (i.e. stellar mass + atomic gas mass) for all galaxies are calculated
as $M_b = M_* + 1.4\, M_{HI}$, where the 1.4 factor is used to account for the cosmic abundance
of helium. Note that ALFALFA non-detected galaxies have two assigned values for their
HI mass, and consequently two values for their baryonic mass, $M_b^{min} = M_*$ and
$M_b^{max} = M_* + 1.4\, M_{HI}^{max}$.
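The assignment of HI upper limits and the resulting baryonic mass range can be sketched as below, using the 25% completeness relation and the $5.372 + \log S + 2\log D$ form quoted above. The function names are illustrative, and the profile width $W_{50}$ is taken as an input here; the Tully-Fisher estimate and inclination projection described in the text are not reproduced.

```python
import math


def log_s_lim(log_w50):
    """25% completeness flux limit (log Jy km/s) for Code 1 & 2 sources."""
    if log_w50 <= 2.5:
        return 0.5 * log_w50 - 1.312
    return log_w50 - 2.562


def log_mhi_max(w50_kms, dist_mpc):
    """Upper limit on log M_HI; 5.372 is log10 of the 2.356e5 prefactor."""
    return 5.372 + log_s_lim(math.log10(w50_kms)) + 2.0 * math.log10(dist_mpc)


def baryonic_mass_range(m_star, w50_kms, dist_mpc):
    """(M_b_min, M_b_max) for a galaxy without an ALFALFA detection."""
    m_hi_max = 10.0 ** log_mhi_max(w50_kms, dist_mpc)
    return m_star, m_star + 1.4 * m_hi_max


# Example with assumed inputs: W50 = 100 km/s, D = 100 Mpc, M_* = 1e9 Msun.
mb_min, mb_max = baryonic_mass_range(1.0e9, 100.0, 100.0)
```

The two branches of the flux limit agree at $\log W_{50} = 2.5$, so the limit is continuous in profile width. Because the upper limit grows as $D^2$, the allowed baryonic mass range widens for more distant non-detections.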

We calculate cumulative mass functions in logarithmic mass bins for all three
components (i.e. stellar mass, atomic hydrogen mass, baryonic mass), separately for the HI-selected
and optically-selected samples. Since neither sample is volume-limited, mass functions have
to take into account the sample selection criteria as well as the large-scale structure in the
survey volume. HI selection is based on a combination of galactic HI integrated flux, $S_{int}$,
and profile width, $W_{50}$ (see §2.1 & discussion in Section 6 of Haynes et al. 2011); as a
result, galaxies of different HI masses and linewidths are detected out to different distances.
Similarly, our optically-selected sample is a flux-limited sample, which results in galaxies
with different r-band absolute magnitudes being detected in different volumes. Mass
distributions are therefore calculated by summing up the number of detections in a given mass
bin (see Fig. 2), with each detection weighted by an appropriate volume factor. Individual
weighting factors are calculated via the "$1/V_{eff}$" method, as implemented in Zwaan et al.
(2005). This is a non-parametric, maximum-likelihood method, which reduces to the standard
$1/V_{max}$ method (Schmidt 1968) when applied to a spatially homogeneous galactic sample.
The advantage of $1/V_{eff}$ is that it is insensitive to local density fluctuations,
and hence mostly immune to structure-induced bias. We refer the interested reader to the
literature for the definition, implementation and details of the method. The method
definition and basic setup are described in Efstathiou et al. (1988); details of the implementation
of the method for HI data can be found in Zwaan et al. (2003, 2005); the specifics of the
application of the method to ALFALFA data can be found in Martin et al. (2010), while a
shorter qualitative description can be found in Papastergis et al. (2011).
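The volume-weighted counting described above can be illustrated with the classical $1/V_{max}$ estimator, to which $1/V_{eff}$ reduces for a homogeneous sample. The sketch below is therefore a simplified stand-in and not the maximum-likelihood $1/V_{eff}$ implementation used in this work; the function names, the Euclidean volume and the per-galaxy maximum distances are assumptions for illustration.

```python
import numpy as np


def vmax_weights(d_max_mpc, solid_angle_sr):
    """1/V_max weights (Mpc^-3) for a survey cone in Euclidean geometry."""
    v_max = solid_angle_sr / 3.0 * np.asarray(d_max_mpc, dtype=float) ** 3
    return 1.0 / v_max


def differential_mass_function(log_mass, weights, log_mass_edges):
    """Weighted counts per bin, divided by bin width (Mpc^-3 dex^-1)."""
    per_bin, _ = np.histogram(log_mass, bins=log_mass_edges, weights=weights)
    return per_bin / np.diff(log_mass_edges)


def cumulative_mass_function(log_mass, weights, log_mass_edges):
    """n(>M) at each lower bin edge: weighted counts summed from the top down."""
    per_bin, _ = np.histogram(log_mass, bins=log_mass_edges, weights=weights)
    return np.cumsum(per_bin[::-1])[::-1]


# Toy example: three galaxies with hypothetical weights.
toy_log_mass = np.array([8.2, 9.5, 9.7])
toy_weights = np.array([1.0, 2.0, 3.0])
toy_edges = np.array([8.0, 9.0, 10.0])
toy_cumulative = cumulative_mass_function(toy_log_mass, toy_weights, toy_edges)
```

In this scheme a galaxy detectable only within a small volume receives a large weight, which is how a flux-limited sample is corrected to a number density per unit volume.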



Lastly, a fraction of galaxies that satisfy all criteria for SDSS spectroscopic followup
cannot be observed for technical reasons (mostly fiber collisions), and are therefore not
included in the SDSS spectroscopic database. We therefore correct the normalization of all
optically-selected distributions by $1/\langle f_{spec} \rangle$, using the average spectroscopic completeness
value reported in Li & White (2009), $\langle f_{spec} \rangle = 0.92$. Similarly, a fraction of the ALFALFA
volume is "lost" due to RFI contamination of certain frequency bands in the ALFALFA
passband. We correct the normalization of all HI-selected distributions by $1/(1 - f_{RFI})$,
where $f_{RFI} = 0.03$. We also note that, due to the 4' beam size of the ALFA
receiver, a number of HI sources are expected to be blended. We do not attempt to correct
for blending but, given that HI-selected galaxies are a weakly clustered population (e.g.
Martin et al. 2012), we anticipate the effect on the HI-selected distributions to be small.
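The size of the two normalization corrections follows directly from the quoted fractions; a short sketch (with illustrative names) makes the arithmetic explicit.

```python
F_SPEC = 0.92  # average SDSS spectroscopic completeness (Li & White 2009)
F_RFI = 0.03   # fraction of the ALFALFA volume lost to RFI


def correct_optical(phi):
    """Scale an optically-selected number density by 1 / <f_spec>."""
    return phi / F_SPEC


def correct_hi(phi):
    """Scale an HI-selected number density by 1 / (1 - f_RFI)."""
    return phi / (1.0 - F_RFI)


optical_factor = correct_optical(1.0)  # multiplicative boost, optical sample
hi_factor = correct_hi(1.0)            # multiplicative boost, HI sample
```

Both factors are of order a few percent to nine percent, i.e. small compared with the differences between samples discussed in the results.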



3.2. Results 

Figure 3 shows the cumulative distributions$^3$ of stellar mass (SMF; gold symbols), HI
mass (HIMF; cyan symbols) and baryonic mass (BMF; black symbols), derived from the
HI-selected galaxy sample. The HI-selected BMF follows closely the HI-selected SMF at high
masses, while at the low-mass end the contribution of the HIMF becomes dominant; this is
because HI-selected galaxies become more gas-rich as their stellar mass decreases. Figure 4
shows the same distributions (SMF, gold line; HIMF, cyan lines; BMF, black lines) derived
from the optically-selected sample. Recall that, in the case of the optically-selected sample,
a lower and upper limit of the HIMF and BMF are shown, since SDSS galaxies that are
undetected by ALFALFA are assigned an upper and lower limit on their HI content, and
therefore also on their baryonic mass (see §2.3).

Fig. 3. The cumulative distributions of stellar mass (SMF, gold symbols), atomic hydrogen
mass (HIMF, cyan symbols) and baryonic mass calculated as $M_b = M_* + 1.4\, M_{HI}$ (BMF,
black symbols), derived from the HI-selected galaxy sample. Error bars represent just the
Poisson counting error assuming independent errors among different mass bins.

Fig. 4. The cumulative distribution of stellar mass (gold line), atomic hydrogen mass
(cyan lines) and baryonic mass calculated as $M_b = M_* + 1.4\, M_{HI}$ (black lines), derived from
the optically-selected galaxy sample. The atomic hydrogen and baryonic distributions are
represented as allowed ranges, based on estimates of the minimum and maximum HI mass
for galaxies that are not detected by ALFALFA ($\{M_{HI}^{min}, M_{HI}^{max}\}$, see §2.3 for details). Error
bars again represent just the Poisson counting error assuming independent errors among
different mass bins.




[Figure 5: horizontal axis $\log M_*\ (h_{70}^{-2} M_\odot)$, spanning 7 to 12]

Fig. 5. Comparison of the differential galactic stellar mass function (SMF) derived from
the HI-selected (gold symbols) and optically-selected (gold line) galaxy samples. Error bars
represent just the Poisson counting error on individual mass bins. The optically-selected
SMF is systematically higher than the HI-selected SMF at the high-mass ($M_* > 10^{11} M_\odot$)
and low-mass ($M_* < 10^{8.5} M_\odot$) ends. This difference is mostly due to the bias of the
HI-selected sample against the red galaxy population (see §3.2 for a detailed discussion).

In Figure 5 we compare the SMFs derived from the optically-selected and HI-selected
samples. The two SMFs are consistent at intermediate masses, but the optically-selected
SMF (gold line) is systematically higher at the high-mass and low-mass ends. At high
masses the discrepancy is due to the bias of the HI-selected sample against the most massive
galaxies, which are usually red passive systems. The discrepancy at the low-mass end is
mostly due to the population of red-sequence dwarf galaxies in the nearby Virgo cluster$^4$
that are undetected by ALFALFA; these are mostly dwarfs with early-type morphologies
and very low HI content (see Hallenbeck et al. 2012, for example). On the other hand, the
HI-selected and optically-selected HIMFs (Fig. 6) are in good agreement with one another,
with the HI-selected HIMF having a slightly steeper low-mass end slope than that suggested
by the range of the optically-selected distribution. The two BMFs are mostly in agreement
with one another, except at the high-mass end (a factor of ~4 at $M_b = 10^{11.5} M_\odot$). This is a
direct consequence of the discrepancy between the optically-selected and HI-selected stellar
mass functions at high masses. Note that there is little difference between the two BMFs at
low masses, which suggests that the low-mass end of the BMF has been measured robustly.

3 We show cumulative mass functions in Figs. 3 & 4 because the cumulative (and not the differential)
distributions are directly related to the stellar and baryon galactic fractions computed in Sec. 5. All other
figures however show differential mass functions, which best display the details and errors of the distributions.

Fig. 6. Comparison of the differential HI mass function (HIMF) derived from the
HI-selected (cyan symbols) and optically-selected (cyan lines) galaxy samples. Error bars
represent just the Poisson counting error on individual mass bins. The HIMFs derived from
the two samples are mostly consistent, with the HI-selected HIMF having a slightly steeper
low-mass end slope than that suggested by the optically-selected HIMF range. See §3.2 for
a detailed discussion.

Fig. 7. Comparison of the differential baryonic mass function (BMF) derived from the
HI-selected (black symbols) and optically-selected (black lines) galaxy samples. Error bars
represent just the Poisson counting error on individual mass bins. The optically-selected
BMF is mostly consistent with the HI-selected BMF, except at the high-mass end. This is a
direct result of the discrepancy between the optically-selected and HI-selected SMFs at high
masses (Fig. 5). See §3.2 for a detailed discussion.



3.3. Comparison with other work 



Figure 8 compares the optically-selected SMF presented in this work (same as gold
line in Fig. 5) with the local-universe SMF of Baldry et al. (2008) and the Yang et al.
(2009) SMF, which are both based on the New York University Value-Added Galaxy Catalog
(NYU-VAGC; Blanton et al. 2005). There is excellent agreement between the Baldry et
al. SMF and our optically-selected SMF, especially at intermediate and low stellar masses
($M_* < 10^{11} M_\odot$). The deviations at high masses are due to the fact that stellar masses in
Baldry et al. are calculated differently than in this work (see Sec. 3 of Baldry et al. 2008
for details); note that a systematic difference of just 0.1 dex (26%) in stellar mass would be
enough to explain the observed difference in abundance at the high-mass end.
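The sensitivity of the high-mass end to small mass offsets can be illustrated numerically. The sketch below uses a toy Schechter function with hypothetical parameters (a characteristic mass of $10^{10.7} M_\odot$ and a slope of -1.3, chosen for illustration and not fitted to any of the SMFs discussed here) and asks by what factor the abundance at fixed mass changes if every stellar mass is raised by a given number of dex.

```python
import math


def schechter_per_dex(log_m, phi_star, log_m_star, alpha):
    """Toy Schechter function per dex; alpha is the slope of dn/dM."""
    x = 10.0 ** (log_m - log_m_star)
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


def abundance_ratio_for_shift(log_m, shift_dex, log_m_star, alpha):
    """Abundance at fixed log_m after all masses are raised by shift_dex,
    relative to the unshifted abundance."""
    shifted = schechter_per_dex(log_m - shift_dex, 1.0, log_m_star, alpha)
    original = schechter_per_dex(log_m, 1.0, log_m_star, alpha)
    return shifted / original


# 0.1 dex expressed as a fractional mass change.
fraction_for_0p1_dex = 10.0 ** 0.1 - 1.0

# Hypothetical Schechter parameters (assumptions for illustration only).
ratio_high_mass = abundance_ratio_for_shift(11.5, 0.1, 10.7, -1.3)
ratio_low_mass = abundance_ratio_for_shift(9.0, 0.1, 10.7, -1.3)
```

On the exponential cutoff a 0.1 dex offset changes the abundance by a factor of a few, whereas on the power-law part the change is below ten percent, which is why systematic mass offsets show up mainly at the high-mass end.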



The Yang et al. (2009) SMF is systematically higher than our optically-selected SMF
at high masses, and displays a more pronounced "plateau" at intermediate masses. It is not
clear what causes the difference at the high-mass end of the distributions, but it may
relate to the fact that the Yang et al. SMF is extracted from a significantly larger volume
than our measurement (the maximum redshift is $z = 0.2$ for the Yang et al. sample and
$z = 0.05$ for the sample used in this work). Moreover, Yang et al. use the prescription of
Bell et al. (2003) to estimate stellar masses, which is based on the galactic $g - r$ color. As
discussed in more detail in §4.1, the use of different stellar mass estimators can significantly
affect the shape of the measured SMF.



Figure 9 compares the HI-selected HIMF presented in this work (same as the cyan symbols
in Fig. 6) with the HIMF of Martin et al. (2010) derived from 10119 galaxies detected
by ALFALFA over ~2600 deg$^2$ of sky (purple solid line). There is excellent agreement at
intermediate and large HI masses ($M_{HI} > 10^{8.5}\,M_\odot$) between the Martin et al. (2010) HIMF
and the HIMF derived in this work, while at lower masses the Martin et al. (2010) HIMF
is slightly steeper than ours. This disagreement may be due to the set of additional optical
requirements imposed on our HI-selected sample. As argued in §2.1, these requirements are
expected to reduce the number of low-mass systems in the sample and therefore decrease
the inferred space density at the low-mass end. The dashed green line represents the HIMF
of Zwaan et al. (2005) based on 4315 sources detected by the HI Parkes All-Sky Survey
(HIPASS) over the whole southern celestial hemisphere (~29000 deg$^2$). There is disagreement
between both ALFALFA-based HIMFs and the HIPASS-based HIMF of Zwaan et al.
(2005) at the high-mass end. As argued in Martin et al. (2010), the higher sensitivity of
the ALFALFA survey compared to HIPASS, which enables ALFALFA to detect HI-massive
systems over a larger volume, should give a statistical advantage to the ALFALFA survey
in determining the high-mass end of the HIMF. However, the difference is too large to
be explained by counting statistics or cosmic variance (e.g. according to the estimates of
Somerville et al. 2004 or Driver & Robotham 2010). On the other hand, due to the exponential
drop-off of the HIMF at high masses, a flux calibration difference as small as 0.1
dex could give rise to a similar discrepancy.

4 The presence of a massive cluster (Virgo cluster) at a distance of just 16.5 Mpc from the observer makes
the ALFALFA survey volume different from an average cosmological volume. The effect of the presence of
the Virgo cluster on the HIMF has been investigated by Martin et al. (2010, §6.1), who found however only
minor effects. More generally, the issue of cosmic variance regarding ALFALFA statistical distributions is

Fig. 8. The gold solid line with error bars represents the differential SMF derived in this
work from the optically-selected sample (same as gold line in Fig. 5). The red diamonds
correspond to the Baldry et al. (2008) SMF in the local universe (z < 0.06), while the
purple triangles correspond to the SMF of Yang et al. (2009), extracted over a larger volume
(z < 0.2). Both the Baldry et al. and Yang et al. SMFs are based on the NYU-VAGC
galaxy catalog. See §3.3 for a discussion of the comparison.

Fig. 9. The cyan symbols with error bars represent the differential HIMF derived in this
work from the HI-selected sample (same as cyan symbols in Fig. 6). The solid purple line
corresponds to the Schechter function fit to the Martin et al. (2010) HIMF, which is based
on the full α.40 catalog of ALFALFA sources and without any optical selection cuts. The
green dashed line corresponds to the Schechter fit to the Zwaan et al. (2005) HIMF, based
on 4315 galaxies detected by the HIPASS survey. See §3.3 for a discussion of the comparison.
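The sensitivity of the high-mass end to small mass offsets can be illustrated with a short numerical sketch. The snippet below evaluates a Schechter function at a given mass and at a mass 0.1 dex higher, well above and well below the "knee" of the distribution. The parameter values (`PHI_STAR`, `M_STAR`, `ALPHA`) are hypothetical and chosen only for illustration; they are not the fits presented in this work or in the studies cited above.

```python
import math

def schechter_per_dex(mass, phi_star, m_star, alpha):
    """Schechter function expressed per dex of mass, dn/dlog10(M)."""
    x = mass / m_star
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

# Hypothetical parameters, for illustration only (not fits from this work).
PHI_STAR = 5.0e-3   # normalization, Mpc^-3 dex^-1
M_STAR = 1.0e10     # "knee" mass, in solar masses
ALPHA = -1.3        # low-mass end slope

def abundance_ratio_after_shift(mass, shift_dex):
    """Ratio phi(mass * 10**shift_dex) / phi(mass) for the toy Schechter function."""
    shifted = mass * 10.0 ** shift_dex
    return (schechter_per_dex(shifted, PHI_STAR, M_STAR, ALPHA)
            / schechter_per_dex(mass, PHI_STAR, M_STAR, ALPHA))

high = abundance_ratio_after_shift(10.0 * M_STAR, 0.1)   # one dex above the knee
low = abundance_ratio_after_shift(0.01 * M_STAR, 0.1)    # two dex below the knee

print(f"0.1 dex corresponds to a factor of {10 ** 0.1:.3f} in mass")
print(f"abundance ratio one dex above the knee:  {high:.3f}")
print(f"abundance ratio two dex below the knee: {low:.3f}")
```

With these assumed parameters, a 0.1 dex offset changes the abundance by more than an order of magnitude above the knee but by only a few percent on the power-law part of the function, which is why modest systematic offsets matter mainly at the high-mass end.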



4. Uncertainties & systematics 
4.1. Stellar mass estimator 



A variety of methods exist to estimate stellar masses from spectra or broadband photometric
measurements of galaxies (Kauffmann et al. 2003; Bell et al. 2003; Brinchmann et al. 2004;
Glazebrook et al. 2004; Gallazzi et al. 2005; Panter et al. 2007; Salim et al. 2007, to
name a few). Most methods rely on comparing the actual galactic emission to the light output
of a set of galactic stellar population models. The models that most closely reproduce
the observed data are then used to estimate the galactic properties of interest (e.g. stellar
mass, present star formation rate, internal extinction etc.); it is therefore very important
to consider model stellar populations which span as large a range of physical parameters
as the galaxies in the sample being studied. Inferred galactic properties depend not only
on the particular type of data employed by each method (e.g. spectroscopy vs. broadband



discussed and quantified in Papastergis et al. (2011, §4.3).
5 http://sdss.physics.nyu.edu/vagc/
Fig. 10. The gold solid line with error bars represents the differential SMF derived
from the optically-selected sample in this work, using stellar masses based on SED-fitting
(Huang et al. 2012). The gold dashed line represents the SMF computed from the same sample
but using stellar masses derived from the galactic $g - r$ color and the i-band luminosity,
according to the widely used Bell et al. (2003) calibration. The gold dotted line represents
the SMF computed using stellar masses derived from the galactic $g - i$ color and the i-band
luminosity, according to the more recent calibration of Taylor et al. (2011). See §4.1 for
a discussion of the comparison.
photometry or optical vs. near-infrared photometry), but also on differences in the way in
which the model stellar populations are constructed. This means that different methods can
yield different estimates of a galactic property even when the same observational measurements
are used. For example, Pforr et al. (2012) find that unbiased stellar masses can only
be recovered when the true star formation history (SFH) of a galaxy is known. In practice,
however, a restricted set of SFHs is considered (often in the form of a parametrized function),
which may introduce systematics for galaxies with SFHs that are not well described
by the assumed general form. Additional complications can be introduced by the different
treatment of dust reddening among different models. In general, stellar mass estimates can
differ systematically by as much as 0.3 dex, while for individual galaxies the scatter can be
as large as 0.6 dex (Pforr et al. 2012).




Fig. 11. A galaxy-by-galaxy comparison of stellar masses derived from SED fitting of the
SDSS u, g, r, i, z bands used in this work (see §2.3), and those derived from the galactic $g - r$
color and the i-band luminosity according to the widely used Bell et al. (2003) calibration.
Each datapoint corresponds to a galaxy in the optically-selected sample, and the symbol color
represents the $g - r$ color of the galaxy. The two stellar mass estimates agree fairly well for red
passive galaxies, while for blue star-forming galaxies Bell et al. masses are systematically
larger by up to a factor of ~2 (see §4.1 for a detailed discussion).



Here we compare the optically-selected SMF presented in this work (computed from
stellar masses derived from SED-fitting, see §2.3) with the SMF obtained using stellar masses
derived from a single galactic color, according to the widely used Bell et al. (2003) calibration
as well as the more recent calibration of Taylor et al. (2011). More specifically, we compute
Bell et al. masses by multiplying the i-band luminosity of each galaxy by a mass-to-light
ratio inferred from its $g - r$ color^6. We choose this particular combination of bands because
it is relatively immune to contamination of galactic spectra by bright nebular emission lines
(West et al. 2009). We use a similar procedure to calculate Taylor et al. masses, by using
their calibration of i-band mass-to-light ratio versus $g - i$ color. Taylor et al. (2011) argue
that using $g - i$ colors best constrains the galactic stellar mass estimates.

Figure 10 shows the impact that different methods of estimating stellar mass have on
the measurement of the SMF. When Bell et al. stellar masses are used (gold dashed line), the
SMF becomes systematically higher at low and intermediate masses ($M_* < 10^{11}\,M_\odot$), while
it remains mostly unchanged at the high-mass end. The reason for this pattern becomes
evident in Fig. 11, where we see that Bell et al. masses agree fairly well with the masses
derived in this work for red passive galaxies, but are systematically larger (by up to a factor
of ~2) for blue star-forming galaxies. Huang et al. (2012) argue that the difference can be
primarily attributed to the fact that the stellar population models used for the Bell et al.
calibration do not consider "bursty" star formation histories, which are typical of low-mass
galaxies with blue colors. This leads to systematically older stellar populations for blue
galaxies according to the Bell et al. method, which in turn results in systematically higher
stellar mass estimates. Note that including models with bursty SFHs in a stellar population
library does not by itself guarantee a correct estimate of stellar mass; overestimating the effect
of bursts would result in systematically low stellar masses for blue galaxies. Conversely, when
Taylor et al. masses are used (gold dotted line), the SMF becomes systematically lower at
the high-mass end, while it is mostly unchanged at low and intermediate masses. Again, this
is a result of the fact that Taylor et al. masses agree well with the SED-fitting masses used
in this work for blue star-forming galaxies, but are systematically lower (by up to a factor
of ≈1.4) for red passive galaxies.
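The color-based estimates described above amount to a linear relation between color and the logarithm of the i-band mass-to-light ratio. The following is a minimal sketch of that procedure. The calibration coefficients are quoted here as assumptions and should be checked against Bell et al. (2003) and Taylor et al. (2011) before any quantitative use; the function name and the example galaxy are hypothetical.

```python
# Solar absolute magnitude in the i band, as adopted in this work.
M_SUN_I = 4.57

# Assumed linear calibrations, log10(M*/L_i) = a + b * color.
# Coefficients are assumptions to be verified against the original papers.
BELL03_GR = (-0.222, 0.864)   # Bell et al. (2003): color is g - r
TAYLOR11_GI = (-0.68, 0.70)   # Taylor et al. (2011): color is g - i

def log_stellar_mass(abs_mag_i, color, calibration):
    """Return log10(M*/M_sun) from an i-band absolute magnitude and one color."""
    a, b = calibration
    log_lum_i = -0.4 * (abs_mag_i - M_SUN_I)   # log10(L_i / L_sun,i)
    log_mass_to_light = a + b * color          # log10(M*/L_i)
    return log_mass_to_light + log_lum_i

# Hypothetical galaxy: M_i = -20, g - r = 0.5, g - i = 1.0
bell_mass = log_stellar_mass(-20.0, 0.5, BELL03_GR)
taylor_mass = log_stellar_mass(-20.0, 1.0, TAYLOR11_GI)
print(f"Bell et al. estimate:   log M* = {bell_mass:.3f}")
print(f"Taylor et al. estimate: log M* = {taylor_mass:.3f}")
```

Because the two calibrations differ in both zero point and slope, their relative offset depends on color, which is consistent with the color-dependent differences seen in Fig. 11.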



4.2. Distance uncertainties 

Stellar, HI and baryonic masses are distance-dependent quantities and hence their sta- 
tistical distributions are affected by distance uncertainties. This is particularly true for the 



6 We use SDSS colors, computed from Galactic extinction-corrected model magnitudes (modelmag), to 
calculate mass-to-light ratios in the i-band. i-band luminosities are then calculated from the i-band Petrosian 
magnitudes reported in SDSS (petromag), corrected for Galactic extinction according to the values listed in 
the SDSS database. The solar absolute magnitude in the i-band is taken to be $M_{\odot,i} = 4.57$.
Fig. 12. The cyan solid line with error bars represents the differential HIMF derived
from the HI-selected sample in this work, using flow model distances (Masters 2005) for
most nearby galaxies. The cyan dashed line represents the HIMF computed from the same
sample but using simple Hubble distances for all galaxies. Distance uncertainties affect
primarily the mass estimates of nearby galaxies and so the main effect is a change of the
low-mass end slope of the distribution (see §4.2 for further discussion).
low-mass end of the distributions, which is determined by the properties of nearby galaxies;
neglecting the peculiar velocity of some of these objects can cause fractional distance and
mass errors of order ~100%, especially in a volume with complex large-scale structure such
as the one surveyed by ALFALFA (see Fig. 1).

For this reason, nearby galaxies ($v_{CMB} < 6000$ km s$^{-1}$) in the α.40 catalog are assigned
distances based on a parametric peculiar velocity flow model (Masters 2005), and only more
distant galaxies ($v_{CMB} > 6000$ km s$^{-1}$) are assigned simple Hubble distances according to
their CMB recessional velocity. The Masters (2005) flow model includes a dipole and a
quadrupole component (local group bulk motion & asymmetric expansion) and two local
attractors (Virgo cluster & Great Attractor), and is calibrated against the SFI++ catalog
of galaxies with redshift and independent distance information (from Tully-Fisher).
The residuals are then attributed to random thermal motions, which are estimated to have
a magnitude of $\sigma_{local} = 160$ km s$^{-1}$. In addition, distances reported in the α.40 catalog take
into account known group and cluster membership as well as primary distance information
published in the literature. This latter information is not available for the majority of the
galaxies in our optically-selected sample (which are not included in α.40), and we only
make an attempt to assign all probable Virgo members to the Virgo cluster distance (D =
16.5 Mpc).

Here we re-evaluate the HIMF for our HI-selected sample using pure Hubble distances
for all galaxies, in order to illustrate the impact of the distance assignment scheme
on the derived distributions. Figure 12 shows that the HIMF computed using Hubble distances
(cyan dashed line) has a much shallower low-mass end slope compared to the HIMF
presented in this work, which uses flow model distances for most nearby galaxies (cyan solid
line). This result is in agreement with the work of Masters et al. (2004), who find that neglecting
the local peculiar velocity field when estimating distances to nearby galaxies in the
ALFALFA volume will lead to a systematically shallower low-mass end slope for the HIMF.
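The size of the effect can be gauged with a back-of-the-envelope sketch. The snippet below compares the pure Hubble distance of a galaxy with the distance it would have if a peculiar velocity were removed from its observed velocity, and propagates the difference to the HI mass through the standard optically-thin relation $M_{HI} = 2.356 \times 10^5\,D^2\,S_{int}$ (D in Mpc, integrated flux in Jy km s$^{-1}$). The value of `H0`, the function names and the example velocities are assumptions made for illustration; this is a toy calculation and not the Masters (2005) flow model.

```python
H0 = 70.0  # Hubble constant in km/s/Mpc (assumed value, for illustration only)

def hubble_distance_mpc(velocity_kms):
    """Pure Hubble-flow distance for a given recessional velocity."""
    return velocity_kms / H0

def hi_mass(distance_mpc, flux_jy_kms):
    """HI mass in solar masses from distance and integrated 21cm flux."""
    return 2.356e5 * distance_mpc ** 2 * flux_jy_kms

def fractional_errors(v_observed, v_peculiar, flux_jy_kms=1.0):
    """Fractional distance and HI-mass errors made by ignoring v_peculiar.

    The 'true' distance is taken to be the Hubble distance of the velocity
    with the peculiar component removed (a toy assumption).
    """
    d_naive = hubble_distance_mpc(v_observed)
    d_true = hubble_distance_mpc(v_observed - v_peculiar)
    dist_err = d_naive / d_true - 1.0
    mass_err = hi_mass(d_naive, flux_jy_kms) / hi_mass(d_true, flux_jy_kms) - 1.0
    return dist_err, mass_err

# A 160 km/s peculiar velocity for a very nearby and a more distant galaxy.
near = fractional_errors(320.0, 160.0)
far = fractional_errors(6000.0, 160.0)
print(f"v = 320 km/s:  distance error {near[0]:.0%}, mass error {near[1]:.0%}")
print(f"v = 6000 km/s: distance error {far[0]:.1%}, mass error {far[1]:.1%}")
```

Since mass scales as the square of distance, the same peculiar velocity that is negligible at 6000 km s$^{-1}$ can change the inferred mass of a very nearby galaxy by a factor of a few, and nearby galaxies are the ones that populate the low-mass end.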



4.3. Molecular & ionized gas 

Throughout this article we use the term "baryonic mass" to refer to the sum of the stellar
and atomic gas mass ($M_b = M_* + 1.4\,M_{HI}$), a convention that is common in the literature.
This definition however excludes a number of baryonic components that are definitely present
in galaxies, most notably molecular and ionized (warm or hot) gas.

Figure 13 displays the fraction of molecular hydrogen (H$_2$) mass (accounting for helium)
to "baryonic mass" as defined above, as a function of stellar mass. The cyan diamonds represent
14 galaxies from the HERACLES survey (Leroy et al. 2009) with H$_2$ masses measured
from interferometric CO line observations (using a fixed $\alpha_{CO}$ conversion factor). Over the
probed stellar mass range ($M_* > 10^{8.5}\,M_\odot$), molecular gas rarely contributes more than 10%
of the "baryonic mass". The same conclusion is reached when H$_2$ mass measurements from
the COLD GASS survey are considered (Saintonge et al. 2011). The blue solid and dotted
lines show the mean and 2σ scatter of the relation between the molecular and "baryonic" mass
components, based on 125 galaxies detected in CO emission with the IRAM 30m telescope.
In this latter case we have estimated the atomic hydrogen mass of galaxies indirectly, using
the average $M_{HI}/M_*$ vs. $M_*$ relation of the COLD GASS parent sample (Catinella et al.
2010). Again, over the stellar mass range probed by the survey ($10^{10} < M_*/M_\odot < 10^{11.5}$),
molecular gas is always a subdominant mass component. At lower stellar masses there is
large uncertainty on the fractional contribution of H$_2$, as it is not precisely known how well
the galactic CO emission traces molecular hydrogen mass. In particular, the $\alpha_{CO}$ conversion
factor may vary by about an order of magnitude as we consider less luminous and more
metal poor late-type galaxies (e.g. Boselli et al. 2002).

Fig. 13. Ratio of molecular gas mass (corrected for the abundance of helium) to the
"baryonic mass" computed as $M_b = M_* + 1.4\,M_{HI}$. Cyan diamonds represent 14 galaxies
of the HERACLES survey (Leroy et al. 2009) detected in CO line emission. The solid and
dotted blue lines represent the average and 2σ scatter of the distribution found for 125
CO detected galaxies in the COLD GASS survey (Saintonge et al. 2011). Both samples
show that, at least for $M_* > 10^{8.5}\,M_\odot$, molecular gas is almost always a subdominant mass
component (see §4.3 for details).



Determining the contribution of ionized gas to the total baryonic mass budget of galaxies
is much more challenging. For example, Reynolds (2004) argues that warm ionized hydrogen
(HII) may amount to about 1/3 of the mass of atomic hydrogen (HI) in the disk of the
Milky Way. If the ratio of ionized-to-neutral hydrogen mass ($f_{HII}$) were fairly constant
among galaxies, then the baryonic mass of a galaxy would be given by the expression $M_b =
M_* + 1.4\,(1 + f_{HII})\,M_{HI}$. If $f_{HII} \approx 0.3$, then the peak of the $\eta_b - M_h$ relation (see Fig.
17) would shift to lower halo mass and the peak value would slightly increase. However,
since the precise value and scatter of $f_{HII}$ is not well constrained, and its dependence on
galaxy size is not known, we choose not to include the contribution of warm ionized gas in
the calculation of $M_b$.
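The two definitions of baryonic mass discussed in this section can be written as a single expression, sketched below. The function name and the example galaxies are hypothetical; `f_hii = 0.3` corresponds to the Milky Way estimate quoted above, and assuming it is constant among galaxies is exactly the untested assumption discussed in the text.

```python
def baryonic_mass(m_star, m_hi, f_hii=0.0):
    """Baryonic mass M_b = M_* + 1.4 (1 + f_HII) M_HI, in the units of the inputs.

    The factor 1.4 accounts for helium; f_hii is the assumed ratio of warm
    ionized to atomic hydrogen mass (0 recovers the definition used in this work).
    """
    return m_star + 1.4 * (1.0 + f_hii) * m_hi

# Hypothetical gas-rich dwarf and gas-poor massive galaxy (solar masses).
dwarf = (1.0e8, 1.0e9)      # (M_*, M_HI)
giant = (1.0e11, 1.0e10)

for name, (m_star, m_hi) in (("dwarf", dwarf), ("giant", giant)):
    fiducial = baryonic_mass(m_star, m_hi)
    with_hii = baryonic_mass(m_star, m_hi, f_hii=0.3)
    print(f"{name}: M_b increases by a factor of {with_hii / fiducial:.3f}")
```

In this toy example the correction matters most for gas-dominated systems, so a constant $f_{HII}$ would raise the baryonic mass of low-mass galaxies more than that of massive ones.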

Assessing the contribution of the hot ionized medium (HIM) to the total baryonic mass
of a galaxy is even more challenging. The coronal HIM may be the dominant baryonic
mass component in galactic halos, especially in massive ellipticals. Determining the mass
contribution of the HIM for less massive galaxies however is observationally challenging. In
any case, the tightness of the "baryonic Tully-Fisher relation" when computed just from
the stellar and HI mass (e.g. McGaugh 2012; Hall et al. 2012) implies that the HIM never
dominates the total baryonic mass budget of late-type galaxies, at least within the extent of
the galactic HI disk.
5. The stellar & gas content of DM haloes



5.1. The abundance matching method and its application 

Let $N_{gal}(M_b)$ be the cosmic number density of galaxies with baryon mass greater than
$M_b$ and let $N_h(M_h)$ be the cosmic number density of haloes with mass greater than $M_h$.
The fundamental assumption of the abundance matching method (Marinoni & Hudson 2002;
Vale & Ostriker 2004; also see Behroozi et al. 2010 for a review) is that $M_b$ is a monotonically
increasing function of $M_h$. With this assumption, $M_b(M_h)$ can be determined by solving the
equation

$$N_{gal}(M_b) = N_h(M_h). \qquad (4)$$

In reality, the baryon content of a halo will depend not only on its mass but also on other
parameters, such as its formation history. As a result, a scatter in the distribution of $M_b$ at
a given $M_h$ is expected. Neglecting the scatter is, nevertheless, justifiable because the aim
of abundance matching is precisely to determine the average value of $M_b$ within a halo of
mass $M_h$.
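Equation 4 can be solved numerically by interpolating one cumulative function onto the other. The following is a minimal sketch under stated assumptions: both cumulative number densities are tabulated on mass grids, both decrease strictly with mass, and scatter is ignored. The function name and the toy power-law inputs are hypothetical and are not the mass functions used in this work.

```python
import numpy as np

def abundance_match(mb_grid, n_gal, mh_grid, n_halo):
    """Return M_b(M_h) on mh_grid such that N_gal(>M_b) = N_h(>M_h).

    mb_grid, mh_grid : increasing arrays of baryonic and halo mass
    n_gal, n_halo    : cumulative number densities (strictly decreasing)
    Interpolation is done in log-log space. np.interp needs increasing
    abscissae, so the galaxy arrays are reversed. Halo abundances outside
    the tabulated galaxy range are clamped to the end values by np.interp.
    """
    log_n_gal = np.log10(np.asarray(n_gal, dtype=float))[::-1]
    log_mb = np.log10(np.asarray(mb_grid, dtype=float))[::-1]
    log_n_halo = np.log10(np.asarray(n_halo, dtype=float))
    return 10.0 ** np.interp(log_n_halo, log_n_gal, log_mb)

# Toy power-law cumulative functions (hypothetical, for illustration only).
mh_grid = np.logspace(10, 14, 41)
n_halo = 1.0e10 / mh_grid            # N_h(>M_h)
mb_grid = np.logspace(8, 13, 51)
n_gal = 5.0e8 / mb_grid              # N_gal(>M_b)

mb_of_mh = abundance_match(mb_grid, n_gal, mh_grid, n_halo)
print(mb_of_mh[:3] / mh_grid[:3])    # matched M_b / M_h for the first few haloes
```

For these toy inputs the matched relation is a constant ratio $M_b/M_h$, because both cumulative functions are power laws with the same slope; with realistic mass functions the ratio varies with halo mass.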

We evaluate the right hand side of equation 4 using a halo mass function extracted
from one of the cosmological N-body simulations of the Horizon Project^7. The simulation
was run with a public version of the GADGET code (Springel 2005), and uses $1024^3$ particles
of mass $m_p \simeq 8.5 \times 10^7\,M_\odot$ to simulate the formation and evolution of DM structures in a
comoving volume of $100\,h^{-1}$ Mpc on a side. It assumes a cosmology and initial conditions
which are consistent with Wilkinson Microwave Anisotropy Probe (WMAP) third year results
(Spergel et al. 2007), namely $h = 0.73$, $\Omega_\Lambda = 0.76$, $\Omega_m = 0.24$, and $\sigma_8 = 0.76$. The
identification of DM haloes and sub-haloes was done with the adaptaHOP algorithm presented
in Aubert et al. (2004). Haloes are identified as groups of particles above a threshold
over-density of 80 times the mean density of the universe, which corresponds to a mean
over-density contrast of about 200. The identification of subhaloes within haloes is done using
the method described in Tweed et al. (2009). We only keep haloes and sub-haloes with more
than 20 particles, i.e. we introduce a minimum halo mass of $M_h \simeq 1.7 \times 10^9\,M_\odot$.
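As a consistency check, the quoted particle mass and minimum halo mass follow directly from the box size, particle number and cosmological parameters. The sketch below assumes the standard value of the critical density, $\rho_{crit} = 2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$, and that all of the matter content ($\Omega_m$) is sampled by the simulation particles; the function name is hypothetical.

```python
RHO_CRIT_H2 = 2.775e11   # critical density in h^2 M_sun Mpc^-3 (standard value)

def particle_mass(omega_m, h, box_mpc_over_h, n_per_side):
    """Mass per particle (M_sun) for a box that samples the mean matter density."""
    box_mpc = box_mpc_over_h / h                  # comoving box side in Mpc
    rho_matter = omega_m * RHO_CRIT_H2 * h ** 2   # mean matter density, M_sun Mpc^-3
    return rho_matter * box_mpc ** 3 / n_per_side ** 3

m_p = particle_mass(omega_m=0.24, h=0.73, box_mpc_over_h=100.0, n_per_side=1024)
m_min = 20 * m_p   # 20-particle threshold quoted in the text

print(f"particle mass     ~ {m_p:.2e} M_sun")
print(f"minimum halo mass ~ {m_min:.2e} M_sun")
```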



An important issue for our analysis is determining whether subhaloes should be included
or excluded in the calculation of the HMF used in the abundance matching procedure. Let
$M_*^{cen}(M_h)$ be the stellar mass of the central galaxy in a halo of mass $M_h$ and let $M_*^{sat}(M_h)$ be the
stellar mass of the satellite galaxy in a subhalo of mass $M_h$. If subhaloes are excluded, then we



7 http://www.projet-horizon.fr
Fig. 14. Effects of the choice of stellar and halo mass functions on the $M_* - M_h$ relation.
The black solid line shows the $M_* - M_h$ relation obtained by matching the stellar mass
function of central galaxies to the mass function of haloes excluding subhaloes. This matching
should reproduce the "true" relation for central galaxies. The blue dot-dashed line is the
relation obtained by matching the total galaxy stellar mass function, including central and
satellite galaxies, to the mass function of haloes excluding subhaloes. The red dashed line
is the relation obtained by matching the total galaxy stellar mass function with the mass
function of haloes including subhaloes. Note that, unlike all other figures, this figure uses
the stellar mass functions of Yang et al. (2009), who have separately measured the SMF for
central and satellite galaxies.

Fig. 15. The differential SMFs derived from our optically selected sample (yellow curve)
and our HI-selected sample (cyan curve) compared to the results of Yang et al. (2009) for
the total SMF (yellow symbols) and the SMF for central galaxies only (cyan symbols).
implicitly assume that all galaxies in our samples are central galaxies, that is, $M_*^{sat}(M_h) = 0$.
On the other hand, if subhaloes are included, we implicitly assume that the $M_* - M_h$ relation
is the same for both central and satellite galaxies, that is, $M_*^{sat}(M_h) = M_*^{cen}(M_h)$.

In order to understand how much this choice could affect our results, we consider the
stellar mass functions of Yang et al. (2009), for which separate distributions for central and
satellite galaxies have been presented. This allows us to calculate the $M_* - M_h$ relation in
three ways. First, we can match the SMF of central galaxies with the halo mass function
excluding subhaloes; this method will give us the correct $M_* - M_h$ relation for central galaxies,
shown by the black solid line in Fig. 14. Second, we consider the total SMF of central
plus satellite galaxies and match it with the halo mass function excluding subhaloes. This
is equivalent to assuming that all galaxies are central and overestimates the $M_* - M_h$ relation for
central galaxies (blue dot-dashed line in Fig. 14). Finally, we can match the total SMF of
central plus satellite galaxies with the total halo mass function including subhaloes. The
result, shown by the red dashed line in Fig. 14, lies below the black solid line because it
is effectively a weighted average of the relation for the dominant central galaxy population
(the black solid line) and the relation for the satellite population, which has lower $M_*$ for
a given $M_h$. Quantitative comparisons of $M_*^{cen}(M_h)$ and $M_*^{sat}(M_h)$ have received much attention
in the recent literature (see e.g. Cattaneo et al. 2012, submitted; Rodriguez-Puebla et al.
2012; Reddick et al. 2012).



As we do not distinguish between central and satellite galaxies in the samples used in
this work, we shall choose the third method (the one that corresponds to the red dashed
line) as our best estimator of the $M_* - M_h$ relation and the $M_b - M_h$ relation. This choice will
introduce some systematic bias which is, nonetheless, smaller than the typical uncertainty
involved in the determination of the $M_* - M_h$ relation.

We also considered whether abundance matching our HI-selected SMF with the HMF
excluding subhaloes would give consistent results with our fiducial abundance matching result,
obtained by matching our optically-selected SMF and the HMF including subhaloes.
Physically, this consideration was motivated by the fact that satellite galaxies tend to be redder
than central galaxies of the same mass (e.g. Weinmann et al. 2006), and so HI-selection
may be equivalent to the exclusion of satellite galaxies from an observational sample. However,
the comparison in Fig. 15 of our HI-selected SMF with the SMF for central galaxies of
Yang et al. (2009) shows that this argument may not be valid. We note that the comparison
between the Yang et al. SMF for central galaxies and our HI-selected SMF is subject to
observational systematics, such as the distance assignment scheme or the stellar mass estimator;
for example, using Hubble distances for the galaxies in our HI-selected sample (see
Fig. 12) would bring the two distributions in fair agreement.
5.2. Results



Our abundance matching analysis produces two main results: 



1. We determine $M_*/M_h$ as a function of $M_h$ by comparing the galaxy stellar mass function
(SMF) from our optical sample to the total halo mass function (including subhaloes).

2. We determine $M_b/M_h$ as a function of $M_h$ by comparing the baryonic mass function
(BMF) from our optically selected sample to the total halo mass function (including
subhaloes).



The latter relation is the focus of this paper, but we first discuss the former because
the results can be compared to an extensive literature of previous studies. The consistency
of our findings with previous work on point 1 boosts our confidence that our conclusions on
point 2 are robust.



Figure 16 shows our result for $M_*/M_h$ as a function of $M_h$, obtained from our optically
selected sample (gold thick solid line). Our analysis extends to halo masses as low
as $M_h \approx 10^{10.5}\,M_\odot$, since both our optically-selected and HI-selected samples probe stellar
masses down to $M_* \approx 10^7\,M_\odot$. At the high mass end, our relation stops at $M_h \approx 10^{14}\,M_\odot$
because our galactic samples are drawn from a relatively small volume, and are not appropriate
to measure the abundance of the most massive galaxies and clusters. We find good
agreement with previous estimates from abundance matching obtained in the same manner
(Moster et al. 2010: cyan dashed line; Behroozi et al. 2010: blue dotted-dashed line;
Evoli et al. 2011: green solid line; Leauthaud et al. 2012: red solid line; Baldry et al. 2008^8:
purple dotted line). The thick yellow dash-dotted and dashed lines are shown to illustrate
the systematics introduced by the choice of HMF and SMF: the former represents the abundance
matching result when our fiducial SMF is matched to the HMF excluding subhaloes; the
latter is the result of matching the Yang et al. SMF with our fiducial HMF, which includes
subhaloes.

We also compare our average $M_* - M_h$ relation with values measured for individual galaxies.
The small cyan, red and blue circles correspond to galaxies with measurements of their
halo mass $M_h$ from weak lensing (Mandelbaum et al. 2006; Hoekstra 2007; Leauthaud et al.
2010). The star symbols correspond to galaxies for which $M_h$ was estimated from stellar



8 The Baldry et al. (2008) abundance matching result uses a "galactic" halo mass function by
Shankar et al. (2006).
[Figure 16 plot: stellar-to-halo mass ratio as a function of log $M_h/M_\odot$, comparing the results of this work with previous work.]

Fig. 16. The ratio of galactic stellar mass to halo mass as a function of host halo mass
($M_*/M_h - M_h$ relation). The thick yellow line shows our main result, obtained from abundance
matching the stellar mass function of our optically-selected sample with the halo mass
function including subhaloes. The yellow dashed and dash-dotted lines correspond to variations
of our main result, obtained by considering the Yang et al. (2009) SMF and excluding
subhaloes from the HMF respectively, and are shown to illustrate uncertainties. The magenta
dotted, cyan dashed, blue dot-dashed, green solid and red solid lines correspond to
the abundance matching results of Baldry et al. (2008), Moster et al. (2010), Behroozi et al.
(2010), Evoli et al. (2011) and Leauthaud et al. (2012), respectively. The big green circles
are the results of a stacked weak lensing study of SDSS galaxies by Reyes et al. (2012). All
other data points refer to measurements for individual galaxies: the small circles correspond
to galaxies with halo mass measurements from weak lensing studies (Mandelbaum et al.
2006: cyan circles; Hoekstra 2007: red circles; Leauthaud et al. 2010: blue circles). The star
symbols correspond to galaxies with halo masses estimated from stellar dynamics
(Conroy et al. 2007: cyan stars; More et al. 2011). The triangles correspond to disc galaxies
with halo masses determined from the disc rotation speed (Geha et al. 2006: cyan triangles;
Pizagno et al. 2007: red triangles; Springob et al. 2007: blue triangles). The dotted-dashed
horizontal line shows the cosmic baryon fraction $f_b \approx 0.16$.
dynamics (Conroy et al. 2007; More et al. 2011). The triangles correspond to disc galaxies
for which $M_h$ was determined from their rotation speed (Geha et al. 2006; Pizagno et al.
2007; Springob et al. 2007). We remark that, while results for individual galaxies have large
scatter, they seem to be systematically lower than any of the relations inferred from abundance
matching (at least over the halo mass range $M_h = 10^{11} - 10^{12}\,M_\odot$). Furthermore,
the halo mass for which $M_*/M_h$ has a maximum appears to be higher when inferred from
measurements of individual galaxies compared to the value derived from abundance matching:
in the first case the peak is at $M_h \approx 10^{12.5}\,M_\odot$, while in the second case it is at
$M_h \approx 10^{12}\,M_\odot$. There is also slight disagreement of all abundance matching results with the
results of Reyes et al. (2012) (large green circles), who used a stacked weak lensing analysis
of over a hundred thousand disk galaxies in SDSS separated in three bins of stellar mass.
Regardless of the method used, however, there is a clear consensus that $M_*/M_h$ is much lower
than the universal baryon fraction $f_b \approx 0.16$ (horizontal black dash-dotted line in Fig. 16)
at all halo masses.

Let us now examine the results for the M_b/M_h - M_h relation (Fig. 17). The gray shaded
area shows the relation derived from our optically-selected sample, matched to a halo mass
function that includes both haloes and subhaloes: its upper and lower envelopes correspond
to the distributions for M_b^max and M_b^min, respectively, as defined in §2.3. The thick gold
solid line represents the relation for the stellar mass (same as in Fig. 16) and has been
added for reference. Figure 17 shows that η_b decreases monotonically with
decreasing halo mass, despite the fact that atomic gas contributes progressively more to the
baryonic mass budget of galaxies.
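
The matching step behind such a relation can be illustrated with a minimal sketch. It assumes, purely for illustration, that both the galaxy baryonic mass function and the halo mass function are Schechter-like; the parameter values and function names below are hypothetical and are not the measured ones. Each halo mass is assigned the baryonic mass that has the same cumulative number density.

```python
import math

def schechter_cumulative(log_m, log_m_char, phi_char, alpha,
                         log_m_max=16.0, n_steps=1000):
    """Number density of objects more massive than 10**log_m for a Schechter
    function, integrated over log10(mass) with the trapezoid rule."""
    if log_m >= log_m_max:
        return 0.0
    step = (log_m_max - log_m) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        x = 10.0 ** (log_m + i * step - log_m_char)  # mass in units of M_char
        dn_dlogm = math.log(10.0) * phi_char * x ** (alpha + 1.0) * math.exp(-x)
        total += (0.5 if i in (0, n_steps) else 1.0) * dn_dlogm
    return total * step

# Hypothetical toy parameters: (log10 characteristic mass, normalisation, slope).
TOY_GALAXIES = (10.7, 5e-3, -1.3)
TOY_HALOS = (14.0, 5e-5, -1.9)

def match_baryonic_mass(log_m_halo, halos=TOY_HALOS, galaxies=TOY_GALAXIES):
    """Return log10(M_b) such that n_gal(>M_b) equals n_halo(>M_h)."""
    target = schechter_cumulative(log_m_halo, *halos)
    lo, hi = 4.0, 14.0  # bracket in log10(M_b); n(>M) decreases with mass
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if schechter_cumulative(mid, *galaxies) > target:
            lo = mid  # too many galaxies above mid, so the match is more massive
        else:
            hi = mid
    return 0.5 * (lo + hi)

for log_m_halo in (10.5, 11.0, 11.5, 12.0):
    log_m_b = match_baryonic_mass(log_m_halo)
    print(f"log M_h = {log_m_halo:4.1f} -> log M_b = {log_m_b:5.2f}, "
          f"M_b/M_h = {10.0 ** (log_m_b - log_m_halo):.4f}")
```

Bisection on the cumulative counts is used here only because both functions decrease monotonically with mass; a real analysis would match the measured, binned mass functions rather than analytic toy forms.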



In Fig. 18 we compare our results to those of Baldry et al. (2008) and Evoli et al.
(2011), who derived their M_b/M_h - M_h relations from the equation

\[
\frac{M_b}{M_h} = \frac{M_*}{M_h} \left( 1 + \frac{M_{\rm gas}}{M_*} \right) . \qquad (5)
\]



In Eqn. 5 the baryonic mass is computed from the stellar mass, using the mean gas-to-stellar
mass ratio for galaxies as a function of stellar mass. To enable a cleaner comparison, we have
re-derived the M_b/M_h - M_h relation from our M_*/M_h - M_h relation,
using the same procedure followed by Baldry et al. (2008) (as in Eqn. 5). The result is shown
by the black solid line in Fig. 18. The main difference between the results obtained by using
individual galaxy gas masses (gray shaded region) and by adopting a mean gas-to-stellar
mass ratio (black solid line) seems to be an artificial flattening of the M_b/M_h relation at low
masses. This is probably due to the fact that the latter method ignores the large scatter of
galactic M_HI/M_* values about the mean, and leads to the incorrect interpretation that the

Fig. 17. The baryon fraction of galaxies, including stars and atomic gas, as a function of
their host halo mass (M_b/M_h - M_h relation). The gray shaded area shows the results of an
abundance matching analysis of our optically-selected sample. Its boundaries correspond to
two extreme assumptions for the gas content of galaxies detected optically but not in HI: i)
galaxies that are not detected in HI contain no gas (lower boundary) and ii) galaxies that are
not detected in HI contain the largest amount of gas that could have escaped detection by
ALFALFA (upper boundary). The thick solid yellow line is the M_*/M_h - M_h relation (same
as in Fig. 16), and is shown for comparison. The dotted-dashed line shows the baryon
fraction that Okamoto et al. (2008) predict based on hydrodynamic simulations that include
cosmic reionisation.

Fig. 18. The gray shaded area represents the M_b/M_h - M_h relation derived in this work
from our optically-selected sample (same as Fig. 17). Two of the
solid lines show the M_b/M_h - M_h relations that Baldry et al. (2008) and Evoli et al. (2011)
derived from their M_*/M_h - M_h relations (Fig. 16), using the mean gas-to-stellar mass ratio
as a function of stellar mass to account for the gas content of galaxies. The black solid
line corresponds to the results obtained from our optically-selected sample when we use the
same method. The red solid line represents the result of Rodriguez-Puebla et al. (2011), who
used separate M_HI/M_* - M_* relations for blue and red central galaxies. Lastly, the thin black
dotted-dashed line is the same as in Fig. 17 and shows the baryon fraction that Okamoto et al.
(2008) predict based on hydrodynamic simulations that include cosmic reionisation.

baryon retention fraction (η_b) of low-mass halos asymptotes to some fixed value. The red
solid line in Fig. 18 corresponds to the result of Rodriguez-Puebla et al. (2011), who used
a separate mean M_HI/M_* - M_* relation for blue and red galaxies. Their result shows no
signs of flattening at low masses; however, the relation only extends down to M_h = 10^{11} M_⊙.
Independently of the method used, however, all results point to values of η_b that are well
below unity, and cannot be explained by the effects of cosmic reionization alone (black
dotted-dashed line in Fig. 18; see Sec. 5.3 for details).




Fig. 19. HI-to-stellar mass ratio vs. galaxy stellar mass. The symbols with error bars
show the results of Swaters & Balcells (2002, blue circles), Garnett (2002, green circles),
Noordermeer et al. (2005, red circles) and Zhang et al. (2009, magenta circles) for the average
HI-to-stellar mass ratio in bins of stellar mass. The black line is a power-law fit to these
data points. The blue contours represent the distribution of HI fraction (M_HI/M_*) for
the galaxies in our optically-selected sample that are detected by ALFALFA, while the red
inverted triangles are maximum HI fractions (M_HI^max/M_*) for a representative subset of our
optically-selected galaxies that are not detected by ALFALFA.

Note that the Rodriguez-Puebla et al. (2011) result refers to central galaxies only.

The (M_gas/M_*) relation that we use to evaluate the right hand side of Eqn. 5 (black solid
line in Fig. 19) is a power-law fit to the M_HI/M_* data by Swaters & Balcells (2002), Garnett
(2002), Noordermeer et al. (2005) and Zhang et al. (2009). The best fit relation, plotted as
a thin solid black line in Fig. 19, is given by log(M_HI/M_*) = -0.43 log(M_*/M_⊙) + 3.75.
The blue contours in Fig. 19 represent the distribution of M_HI/M_* values for the galaxies in
our optically-selected sample that are detected by ALFALFA. Since the ALFALFA survey is
a blind HI survey with a fixed integration time per pointing, the ALFALFA distribution is
expected to lie above the data obtained by pointed observations of optically-selected galaxies
(as in Swaters & Balcells 2002; Garnett 2002; Noordermeer et al. 2005). The inverted red
triangles correspond to the maximum HI fraction (M_HI^max/M_*) for a representative subsample
of our optically-selected galaxies that lack an ALFALFA detection. Note that these upper
limits are also systematically higher than the relationship indicated by the power-law fit.
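
As a rough illustration of the mean-ratio method, the sketch below evaluates Eqn. 5 with the power-law fit quoted above. The conversion M_gas = 1.4 M_HI is an assumption carried over from the definition of the baryonic mass used elsewhere in the paper; whether the same factor enters the right hand side of Eqn. 5 is not stated here, so it is exposed as a parameter. The function names and the input values are hypothetical.

```python
def hi_to_stellar_ratio(log_m_star):
    """Mean M_HI/M_* from the quoted fit:
    log(M_HI/M_*) = -0.43 log(M_*/M_sun) + 3.75."""
    return 10.0 ** (-0.43 * log_m_star + 3.75)

def baryon_to_halo_ratio(log_m_star, log_m_halo, gas_per_hi=1.4):
    """Eqn. 5: M_b/M_h = (M_*/M_h) * (1 + M_gas/M_*).

    gas_per_hi converts M_HI to M_gas (assumed 1.4; set to 1.0 to use
    the fitted HI ratio directly)."""
    stellar_to_halo = 10.0 ** (log_m_star - log_m_halo)
    gas_to_stellar = gas_per_hi * hi_to_stellar_ratio(log_m_star)
    return stellar_to_halo * (1.0 + gas_to_stellar)

# Hypothetical stellar and halo masses, for illustration only.
print(baryon_to_halo_ratio(log_m_star=9.0, log_m_halo=11.0))
```

Because every galaxy of a given stellar mass receives the same mean gas fraction, this estimate carries no information about the scatter in M_HI/M_* visible in Fig. 19.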



5.3. Discussion 



The main result of this paper is the large "gap" between the present-day baryon fraction
of galaxies in low-mass halos and the cosmic value (f_b ≈ 0.16), which is present even when
the atomic gas content of the galaxies is taken into account. This result is not contrived,
given that atomic gas dominates the baryonic mass budget of galaxies with M_h ≲ 10^{11} M_⊙.
Moreover, the low-mass behavior is in disagreement with previous studies (e.g. Baldry et al.
2008; Evoli et al. 2011), which find that the η_b - M_h relation flattens out at low masses, and
approaches a roughly constant value of η_b ≈ 10%. These previous results would then require
an exceptionally low efficiency of gas-to-stars conversion in low mass systems to explain the
observed values of η_*, which decrease monotonically.

In Fig. 17 we compare our result for the baryon fraction of halos to the predictions at
z = 0 of a cosmological hydrodynamic simulation that includes heating from a photoionizing
UV background (Okamoto et al. 2008, black dotted-dashed curve). At the high-mass end
(M_h > 10^{13} M_⊙), the low values of η_b can be explained by the fact that infalling gas is
shock-heated (Keres et al. 2005; Dekel & Birnboim 2006), and is kept hot by feedback from active
galactic nuclei (Croton et al. 2006; Cattaneo et al. 2006; Bower et al. 2006). This picture
is supported by considerable observational evidence in the case of X-ray groups and clusters,
but the situation within individual galaxies is not so well understood (see Cattaneo et al.
2009 for a review).



At the low mass end, photoionization heating is expected to become important, since
the intergalactic medium is too hot to fall into the shallow potential wells of haloes with
M_h < 10^{10} M_⊙, and their baryon fraction is heavily suppressed. However, this process
alone cannot account for the shape of the η_b - M_h relation at low masses, especially for
the onset of a sharp decrease in the M_b/M_h ratio at relatively large halo masses (M_h ~
10^{12} M_⊙). Additional feedback is therefore needed, usually attributed to the ejection of
baryons by stellar-driven winds. Semianalytic models of galaxy formation based on this
assumption provide a good fit to luminosity functions in the local universe (Guo et al.
2011; Benson & Bower 2010; Somerville et al. 2008; Cattaneo et al. 2006), but the implied
outflow rates are enormous. To reproduce the result presented in Fig. 17, the outflow
rate in a halo with M_h ~ 10^{10.3} M_⊙ must be of order a hundred times higher than the star
formation rate. It is difficult at present to reproduce such outflow rates in hydrodynamic
simulations. Moreover, observational estimates place the total mass of outflows in normal
star-forming galaxies at approximately the same level as the galaxies' final stellar mass
(Zahid et al. 2012). Even in the most extreme observational cases, the "mass-loading factor"
(the ratio of the mass loss rate due to outflows to the star formation rate) is estimated to be
< 10 (Newman et al. 2012). Therefore, explaining in detail the mechanisms responsible for
the very low η_b values found in low-mass galaxies seems to be a fundamental challenge for
models of galaxy formation in a ΛCDM cosmological context.
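
The order of magnitude of the required outflows can be illustrated with a crude, time-integrated bookkeeping sketch. It is not the calculation behind the figure quoted above. It assumes that a halo initially receives its full cosmic share of baryons (f_b M_h), that every baryon missing today was ejected by winds, and that η_* is defined relative to f_b in the same way as η_b. The function name is hypothetical, and the value of η_* used below is a hypothetical input chosen only so that gas dominates the baryon budget.

```python
def integrated_mass_loading(eta_b, eta_star):
    """Ejected mass per unit stellar mass formed, (1 - eta_b) / eta_star.

    eta_b    : retained baryon fraction relative to the cosmic value
    eta_star : stellar fraction relative to the cosmic value (assumed definition)
    """
    if not (0.0 < eta_star <= eta_b <= 1.0):
        raise ValueError("expected 0 < eta_star <= eta_b <= 1")
    return (1.0 - eta_b) / eta_star

# eta_b = 0.02 is the value quoted for M_h = 10^{10.3} M_sun;
# eta_star = 0.005 is a hypothetical input, not a measurement.
print(integrated_mass_loading(eta_b=0.02, eta_star=0.005))
```

With these inputs the ratio comes out at roughly two hundred, far above the observed mass-loading factors of < 10 mentioned above. The estimate ignores gas that was never accreted in the first place, which is why it should be read only as an upper-end illustration.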

Recent cosmological hydrodynamic simulations have challenged this statement, by managing
to reproduce "realistic" galaxies whose properties satisfy a number of observational
constraints. For example, the high-resolution Eris simulation (Guedes et al. 2011) has
managed to produce a Milky-Way type object with values of η_* in agreement with those that we
see in Fig. 17. More recently, McCarthy et al. (2012) have managed to reproduce a
population of ~1000 simulated galaxies with low stellar-to-halo mass ratios (η_* < 0.05 at
M_h ≈ 10^{11.3} M_⊙), in accordance with observations. Notice, however, that this simulation
uses a kinetic rather than thermal wind model, in which the initial wind speed is 600 km/s
and the initial mass-loading factor is four, by construction. Lastly, the work of
Guedes et al. (2011) has been extended to lower masses by Brook et al. (2012), who managed
to produce a pair of dwarf galaxies (M_h ≈ 10^{10.8} M_⊙), which obey the observed "baryonic
Tully-Fisher" relation and are therefore expected to have the correct baryon-to-halo mass
fractions.

While these studies indicate that we may be heading toward a solution of the discrepancy
between the observed and the expected baryon content of dark matter halos, many
questions (e.g. the expected outflow rates and re-accretion timescales) remain unanswered.
Therefore, explaining in detail the mechanisms responsible for the very low η_b values found
in low-mass galaxies remains an open problem for studies of galaxy formation in a ΛCDM
Universe. Furthermore, our measurement of M_b/M_* provides an additional constraint
that cosmological hydrodynamical simulations and semianalytic models will have to
confront.
6. Conclusions

We use optical data from the seventh data release of the Sloan Digital Sky Survey (SDSS
DR7) and 21cm emission-line data from the Arecibo Legacy Fast ALFA (ALFALFA) survey
to measure the "baryonic mass" (defined as M_b = M_* + 1.4 M_HI) of galaxies in the local
universe, and determine the z = 0 baryon mass function (BMF). We use both an HI-selected
and an optically-selected sample (7195 and 22587 galaxies respectively) drawn from the same
volume, in order to address the effects of sample selection on the derived distributions. We
find that the main difference is that the optically-selected stellar mass function (SMF)
is systematically higher at high masses than the HI-selected SMF, and that this
difference carries over to the high-mass end of the BMF (see Figs. 5 & 7).

We combine the obtained mass distributions with the halo mass function in a WMAP3
ΛCDM cosmology, to obtain average values of M_*/M_h and M_b/M_h as a function of halo mass
(Figs. 16 & 17). Our most important result is that low-mass halos seem to have very low
galactic baryon fractions compared to the cosmic value (f_b = Ω_b/Ω_m ≈ 0.16), even when their
atomic gas content is taken into account; for example, the average baryon fraction of halos
with M_h = 10^{10.3} M_⊙ is just 2% of the cosmic value (η_b ≈ 0.02), and displays a monotonically
decreasing trend. This result contrasts with previous indirect measurements of the BMF
(Baldry et al. 2008; Evoli et al. 2011), which pointed to an approximately constant value of
η_b ~ 0.10 at the low halo-mass end.
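
To make these numbers concrete, the short sketch below converts the quoted retention fraction into masses. It assumes that η_b is defined as (M_b/M_h)/f_b, as implied by the phrase "2% of the cosmic value"; the function names are hypothetical.

```python
F_B = 0.16  # cosmic baryon fraction quoted in the text

def galactic_baryonic_mass(log_m_halo, eta_b, f_b=F_B):
    """Baryonic mass retained in the galaxy, M_b = eta_b * f_b * M_h (solar masses)."""
    return eta_b * f_b * 10.0 ** log_m_halo

def missing_baryonic_mass(log_m_halo, eta_b, f_b=F_B):
    """Baryons expected from the cosmic fraction but not found in the galaxy."""
    return (1.0 - eta_b) * f_b * 10.0 ** log_m_halo

m_b = galactic_baryonic_mass(10.3, 0.02)
m_missing = missing_baryonic_mass(10.3, 0.02)
print(f"M_b = {m_b:.2e} M_sun, missing = {m_missing:.2e} M_sun")
```

Under this definition, a halo of M_h = 10^{10.3} M_⊙ hosts a galaxy with M_b/M_h = 0.02 × 0.16 = 0.0032, i.e. a few times 10^7 M_⊙ of stars plus atomic gas, while the remaining 98% of its cosmic share of baryons is unaccounted for.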

Such very low values of η_b are difficult to reconcile with current models of galaxy
formation. Photoionization heating in the early universe suppresses the baryonic content of
halos only at M_h < 10^{10} M_⊙ (Okamoto et al. 2008), but this mass is more than an order
of magnitude smaller than what is required by our result. Therefore, additional feedback
mechanisms, such as baryon blowout by supernova explosions, must be present and must be
extremely efficient. It is not yet clear whether hydrodynamic simulations or observational
results can accommodate such intense galactic outflows in low mass halos. As a result, the
observed η_b - M_h relation remains difficult to explain, and may represent a challenge to our
understanding of galaxy formation and/or the properties of dark matter.